Word Segmentation in Sentence Analysis
Abstract
This paper presents a model of language processing where word segmentation is an integral part of sentence analysis. We show that the use of a parser can enable us to achieve the best ambiguity resolution in word segmentation. The lexical component of this model resolves most of the ambiguities, but the final disambiguation takes place in the parsing process. In this model, word segmentation is a by-product of sentence analysis, where the correct segmentation is represented by the leaves of a parse tree. We also show that the complexity usually associated with the use of a parser in segmentation can be reduced dramatically by using a dictionary that contains useful information on word segmentation. With the aid of such information, the sentence analysis process is reasonably fast and does not suffer from the problems other people have encountered. The model is implemented in NLPWin, the general-purpose language understanding system developed at Microsoft Research. A demo of the system is available.
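The idea that segmentation falls out of parsing can be illustrated with a minimal sketch. This is not NLPWin and not the authors' implementation: the lexicon, the grammar, and all function names below are hypothetical, and unspaced English stands in for a language written without word boundaries. The dictionary proposes every candidate segmentation, and a toy parser keeps only those that form a sentence, so the chosen segmentation is read off the leaves of the parse tree. The sketch does not model the segmentation information that the paper's dictionary uses to reduce complexity.

```python
# Hypothetical toy lexicon: word -> part of speech (an assumption for illustration).
LEXICON = {
    "the": "DET", "cat": "NOUN", "cats": "NOUN", "mat": "NOUN",
    "sat": "VERB", "at": "PREP", "on": "PREP", "them": "PRON",
}

# Hypothetical toy grammar: nonterminal -> list of right-hand sides.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("DET", "NOUN"), ("PRON",)],
    "VP": [("VERB", "PP")],
    "PP": [("PREP", "NP")],
}


def candidate_segmentations(text):
    """Yield every way to split text into words found in the lexicon."""
    if not text:
        yield []
        return
    for end in range(1, len(text) + 1):
        word = text[:end]
        if word in LEXICON:
            for rest in candidate_segmentations(text[end:]):
                yield [word] + rest


def parse(symbol, words):
    """Return a parse tree covering exactly `words`, rooted at `symbol`, or None."""
    if symbol not in GRAMMAR:
        # Part-of-speech symbol: must cover exactly one word with that tag.
        if len(words) == 1 and LEXICON.get(words[0]) == symbol:
            return (symbol, words[0])
        return None
    for rhs in GRAMMAR[symbol]:
        children = match_sequence(rhs, words)
        if children is not None:
            return (symbol, *children)
    return None


def match_sequence(symbols, words):
    """Match a sequence of symbols against words, trying every split point."""
    if not symbols:
        return [] if not words else None
    first, rest = symbols[0], symbols[1:]
    # Leave at least one word for each remaining symbol.
    for cut in range(1, len(words) - len(rest) + 1):
        head = parse(first, words[:cut])
        if head is None:
            continue
        tail = match_sequence(rest, words[cut:])
        if tail is not None:
            return [head] + tail
    return None


def leaves(tree):
    """Read the segmentation off the leaves of a parse tree."""
    if isinstance(tree[1], str):
        return [tree[1]]
    return [word for child in tree[1:] for word in leaves(child)]


text = "thecatsatonthemat"
candidates = list(candidate_segmentations(text))
analyses = [tree for seg in candidates if (tree := parse("S", seg)) is not None]

print(len(candidates), "candidate segmentations")
for tree in analyses:
    print(leaves(tree))
```

In this toy setting the lexicon alone leaves several candidates (for example "the cats at on the mat"), and only the parser rules out the ones that do not form a sentence. A realistic system would use a chart parser and a much richer dictionary rather than exhaustive enumeration.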
Related resources
Thoughts on Word and Sentence Segmentation in Thai
This paper discusses problems of word and sentence segmentation in Thai. Disagreements on word segmentation arise mostly from compound words. To establish a standard resource and tool for word segmentation, we suggest that only simple words and true compound words be segmented during word segmentation. Other compounds can be grouped later by the same means as multiword identific...
Nonparametric Word Segmentation for Machine Translation
We present an unsupervised word segmentation model for machine translation. The model uses existing monolingual segmentation techniques and models the joint distribution over source sentence segmentations and alignments to the target sentence. During inference, the monolingual segmentation model and the bilingual word alignment model are coupled so that the alignments to the target sentence gui...
Evaluating the Success of the Visual Learners in Vocabulary Learning through Word List versus Sentence Making Approaches
This study sought to evaluate the achievements of learners with the visual learning style when exposed to the sentence making and word list approaches. On that account, 45 basic level participants who studied at the Iran Language Institute (ILI), Bushehr, took part in this research study. At the outset, the learners were given the Barsch learning style inventory (1991) to determine the learners' learn...
Error analysis and confidence measure of Chinese word segmentation
Word segmentation for a Chinese sentence is essential for many applications in language and speech processing. No method achieves word segmentation without any errors. We propose a confidence measure for the segmentation result to cope with the problems caused by these errors. The method's effectiveness depends mainly on the error analysis of the word segmentation. With the con...
Publication date: 2003